This document is the preregistration for the Preferential Physics project on Lookit. Data collection is currently underway, but the sample size was set before data collection began (50 participants reaching a 12th session), and the vast majority of videos have not yet been coded for looks, much less had the critical DVs calculated.
This document contains a brief orientation to the project, links to relevant other documentation, and a reproducible set of analyses that reads in a set of pilot data and attempts to set up the model comparisons we actually care about!
It functions as a preregistration in the sense that the final dataset will be run through this analytic code to produce the visualizations and models shown here; any additional analyses added later can be treated as exploratory. Where we have ‘decision trees’ or some aspect of the analysis is conditional on features of the final dataset, we note this where we know about it and specify how we’ll make those decisions.
If you’d like to skip straight to the preregistration details, start at the Data Contents section.
The idea is to use dense sampling of individual infants on Lookit to conduct a detailed assessment of understanding of several physical principles. How stable are individual components across sessions and how independent? What does partial knowledge look like at the individual level?
We are interested in infants’ preferential looking ratio to simple violations of:
Gravity: Completely unsupported objects should fall down immediately, rather than moving up, continuing in their current trajectory, or moving down at some delay.
Inertia: Objects should continue roughly in their current trajectory when gravity is not a factor, rather than stopping and starting or turning around.
Support: Which of the following should fall (vs. stay put) after being placed?
An object placed mostly on the anchor
An object placed only slightly on the anchor
An object touching the side or bottom of the anchor
An object near the anchor but not touching it
Why study individual behavior in more detail? What do we lose in studying groups of kids? We don’t know the extent to which a success means that a significant fraction of kids this age can do this task vs. that all kids of this age can do this task to a significant degree. And even looking at the distribution of scores doesn’t clear this up, unless it’s an incredibly strong result (e.g. all kids get 9-10 of 10 questions right, or no kids get more than 6 out of 10). This matters for
understanding how abilities are related to each other: we can’t get nearly as much out of age-based progressions without knowing how the noise works—especially for results of the form “n-month-olds, but not m-month-olds, can do X”
understanding what partial knowledge & mechanisms of change in a domain look like: when kids “fail,” is that some kids making a correct prediction and some making an incorrect prediction? Or are they all failing to make any prediction and/or predicting at chance? When kids succeed but not at ceiling, are some getting one aspect and some another?
It may be that some kids don’t express nearly-universal knowledge on the dependent measures we collect. We can evaluate this explanation for noise and/or individual differences by studying the methods themselves, and kids’ behavior on them: e.g., can we predict the types of preferential looking responses we get from kids based on control tasks? How stable are those controls and task performance across sessions? (Especially interesting would be differences within kids in expression: kids may genuinely express knowledge at some times, but not others, due to attentional/emotional state changes.)
Especially in development, where we’re interested in the underpinnings of human cognition, the difference between “some babies use this type of information but others do something else” and “all babies have this type of information available to them, unless something’s wrong” matters a lot—this is exactly where we care about universality.
Children complete 24 20-second preferential looking trials per session; families are encouraged to complete 15 sessions within 2 months. Parents complete a short mood survey and go through some instructions before the preferential looking portion. They’re asked to hold their children looking over their shoulders during this portion to avoid parental bias. Parents can end the study at any point and skip to the post-study survey.
Each trial begins with an object intro (a video of Kim saying “Look, this is a …” and demonstrating the object’s use, e.g. biting into an apple, putting on hand lotion, drawing with a marker, eating with a spoon) that lasts about 5 seconds.
This is intended as an attention-getter to re-orient children towards the center, while in principle reinforcing that the object in question is not an agent and should be expected to follow normal physical laws.
Then, two events involving that object are shown simultaneously, one on the left and one on the right, looping continuously for 20s (individual event videos range from about 2-5s). Events always show the same object, same camera angle, and same background, differing only in the “outcome.” Event types are shown in the table below; each concept is presented 4-6 times, with 2-3 repetitions of each event type. Events are shown in real time so that “expected” events occur at natural speeds and are not potentially seen as violating physical principles merely because they happen too slowly.
Parents can pause individual trials. If they pause during the intro, they just start over upon restarting. If they pause during the test (up to one time per trial) they restart from the intro, but then the left and right test videos are switched for the test phase.
(See below for descriptions of these trial types.) Trials cycle through gravity, inertia+calibration, support, and control (same/salience) pairings during a session; the order of these concepts is chosen from a list and changed (cycling through a list of orders) each session. There are six videos shown in each category in total.
Within each category, objects are assigned to comparison types (e.g. “apple” assigned to “table, down vs. up”) by choosing from a list of acceptable mappings, again incremented per session. (The first session value was selected randomly from the first six options initially [by accident], and is now selected from all possible mappings.)
There are six possible comparisons for the stay and fall events; three comparisons are assigned to stay and three to fall, with the selection again cycling through a random list of such assignments per session. Left/right placement, horizontal flipping of the left and right events, camera angles, and backgrounds are chosen randomly with the constraint that half of the ‘more probable’ events are on the left within each category. Calibration trials (grouped with inertia videos for purposes of assigning object intros) are placed at trials 3 and 6, so that they are always available for kids who completed enough trials for the session to be included (and so that if there are differences in coding quality across trials, we’re not excluding on the basis of when calibration happened).
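The per-session cycling described above (advance one step through a fixed list of counterbalancing options each session, wrapping around) can be sketched as follows. This is a minimal illustration only; the concept-order lists here are hypothetical stand-ins, not the actual lists used on Lookit.

```r
# Hypothetical illustration of per-session counterbalancing: each session
# advances one step through a fixed list of concept orders, wrapping around.
concept_orders <- list(
  c("gravity", "inertia", "support", "control"),
  c("support", "control", "gravity", "inertia"),
  c("inertia", "gravity", "control", "support")
)

# Session numbers are 1-indexed; cycle through the list of orders.
order_for_session <- function(session_num, orders = concept_orders) {
  orders[[(session_num - 1) %% length(orders) + 1]]
}

order_for_session(1)  # first list entry
order_for_session(4)  # wraps back around to the first entry
```

The same incrementing scheme applies to the object-to-comparison mappings and the stay/fall assignments, each with its own list.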
Object is rolled/slid off a table and continues down, horizontal, or up.
Object in hand is tossed down but then falls UP, or tossed up but then falls DOWN.
Object is placed in center of ramp and released to roll down or up.
Object rolls from one side of screen and stops in the middle, then re-starts on its own, or by a hand.
Object rolls/slides from one side of the screen and collides with a barrier, or follows the same trajectory and ‘collides’ where there is no barrier.
An object is placed (mostly on/slightly on/next to/near) on a cabinet and immediately falls.
An object is placed (mostly on/slightly on/next to/near) on a cabinet and stays there.
Distinguishable but similar physically-possible human actions on objects, like rotating an object about one axis vs. another
Physically-possible human actions on objects, some more interesting, like flipping a spoon vs. slowly extending it or erasing a drawing vs. an empty board.
A spinning ball moves across the entire screen.
Here we describe the information that will be available in the final dataset. Some of this info is not available for the pilot dataset, in particular anything having to do with more than one session.
Age at start
Demographic info (optional, typically reported): family income, languages spoken at home, parent education level(s), number of parents in the home, number of children’s books in the home, child’s race, number and age of siblings, country + US state if in US, urban/suburban/rural
Child’s age
Number of previous sessions completed
Time since last session
Mood data (Before beginning study, by parent report. Scales 1-7: CHILD: tired-rested, sick-healthy, fussy-happy, calm-active; PARENT: tired-energetic, overwhelmed-ontopofthings, upset-happy. How long since child woke up, how long since child ate, how long until child is due for nap/sleep; what child was doing before this.)
Time looking L, R (full time sequence of looks left, right, away)
Number of fixations (derived from the first one)
Proportion looking to L, out of time looking to screen (derived from the first one)
Parent behavior: times of talking, pointing, and peeking
Infant behavior: times of fussing & rating of fussiness level low or high
Trial number
Comparison type (e.g. ramp up vs. down), nested within…
Event type (e.g. ramp), nested within…
Concept (gravity, inertia, support)
Object (apple, lotion bottle, scissors…)
Whether each side is unexpected (i.e., does the event on the left clearly violate a physical principle? Does the event on the right? Sometimes both are unexpected to adults, e.g. when an object near a cabinet and an object next-to a cabinet stay put; sometimes neither is. In some cases this depends on the child’s potential beliefs, see modeling…) & which side is less expected.
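The derived looking measures listed above (proportion looking left, number of fixations, total looking time) follow mechanically from the coded time sequence of looks. A minimal sketch, assuming a hypothetical per-trial data frame `looks` with columns `start`, `end` (seconds), and `side` (“L”, “R”, or “away”):

```r
# Sketch of the derived looking measures from one trial's coded look sequence.
# (Column names here are illustrative, not the actual coding-export format.)
derive_measures <- function(looks) {
  dur <- looks$end - looks$start
  tL  <- sum(dur[looks$side == "L"])
  tR  <- sum(dur[looks$side == "R"])
  list(
    totalLT    = tL + tR,                          # total time on screen
    fracLeft   = tL / (tL + tR),                   # proportion of screen time spent left
    nFixations = sum(looks$side %in% c("L", "R"))  # number of on-screen look segments
  )
}
```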
Age range 4-12 months at start of study; continue for up to 61 days; target ‘complete’ dataset is 12 usable sessions. (No major age differences in data quality or salience/same controls seen in piloting.)
Plan to recruit as large a balanced sample as practical given time constraints on both testing and recruiting; aiming for 50 participants with a complete dataset. All recruitment decisions are / have been made without examining dependent variables. This will also be the case if we must terminate data collection early (i.e. for timing/funding constraints)
All partial datasets (<12 sessions) and any extra data collected will be included in the analysis. Data will only be excluded from analysis if it meets any of the criteria below; we aim to include as much data as possible and use analyses that are robust to missing/‘unbalanced’ data. Data may be excluded at the level of the participant (i.e. all data from that child is excluded), sessions, or individual trial.
Note also that because (a) we may exclude some-but-not-all data from a participant and (b) the number of times the baby is taken to have seen the study is relevant, the ‘study number’ (= number of sessions seen) may differ from a simple count of the sessions that are present in the dataset.
Gestational age at birth < 37 weeks, for any analyses using age. Unknown gestational age will be used but prevalence reported. (Followup to check that inclusion of premature infants does not qualitatively affect other results, and/or to display results from premature infants with adjusted/non-adjusted age - exploratory.)
Children who participated in the pilot study
Children whose parents spontaneously report developmental/medical issues that would likely explain some differences in task: vision or hearing impairment; cognitive or neurological disorders including due to trisomies.
Note again: We include data from children with any number of sessions (will probably have many with <12 in addition to “complete” data); analyses described should be able to handle this appropriately.
Sessions where children are outside age range of 4-14 months, except for binned-by-age analyses where we may display data from children outside the age range (without affecting any other values) if we end up having it. Adjusted age will be used for premature infants.
Sessions with < 6 trials (these also don’t count as sessions for session-number purposes). Parents are encouraged not to complete sessions within 6 hours of each other. Processing sessions from later to earlier: if another session with fewer completed trials happened within six hours of a given session, use the given session instead. This leads to reasonable outcomes even in the unlikely event someone is doing the study every 5 hours.
Require calibration performance >75% to use session. Calibration scores seem to be mostly due to difficulty coding and might therefore index overall confidence in other judgments; timing differences in webcam stream vs. displayed stimuli will also affect calibration. Pool all looking across the two calibration trials to compute an overall calibration score, so that if kids aren’t looking as much for one of the trials we don’t average in a much noisier measurement. Score is fractional looking time to correct side during the middle of periods when the ball should be static: [0.5, 3.5], [5.5, 8.5], [10.5, 13.5], [15.5, 18.5], [20.5, 23.5].
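A sketch of the pooled calibration score, assuming gaze is available as timestamped samples pooled across the two calibration trials. The alternation of correct sides here is a hypothetical assumption for illustration; the actual side sequence follows the calibration video.

```r
# Middle portions of the periods when the calibration ball should be static.
static_windows <- list(c(0.5, 3.5), c(5.5, 8.5), c(10.5, 13.5),
                       c(15.5, 18.5), c(20.5, 23.5))
correct_sides  <- c("L", "R", "L", "R", "L")  # hypothetical side sequence

calibration_score <- function(looks) {
  # looks: data frame with columns time (s) and side ("L", "R", or "away"),
  # pooled across both calibration trials, one row per coded gaze sample.
  correct <- 0; onscreen <- 0
  for (i in seq_along(static_windows)) {
    w    <- static_windows[[i]]
    seen <- looks$side[looks$time >= w[1] & looks$time <= w[2]]
    onscreen <- onscreen + sum(seen %in% c("L", "R"))
    correct  <- correct  + sum(seen == correct_sides[i])
  }
  correct / onscreen  # fraction of on-screen looking to the correct side
}
```

Pooling the samples before dividing (rather than averaging per-trial scores) is what keeps a low-looking trial from contributing a disproportionately noisy estimate.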
If the participant has <12 usable sessions spread over >60 days, use sessions from the earliest 60-day period (inclusive) with the most usable sessions. If the participant has >=12 usable sessions over >60 days, use sessions from the earliest 60-day period (inclusive) with at least 12 sessions.
Where absolute session number (as an index of how many times the child has seen stimuli, etc.) is relevant, assign the first session used in analysis a session number according to the number of ‘experienced sessions’ in the preceding 60 days. Experienced sessions include sessions where the child is out of age range or calibration performance is poor, but not sessions with <6 trials. (For instance, if a child participates at 4 months of age and then 12 times from 10-11 months, the latter set of data is used and the first session at 10 months is considered session 1. If a child participates on days 1, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, then sessions starting at day 30 are used but the first session number is 2.)
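The 60-day window rule above can be sketched directly; a minimal implementation, assuming session days are given as a sorted numeric vector of days since the first session (the function name and interface are ours, not part of the analysis code):

```r
# Select which sessions to analyze under the 60-day window rule.
# days: sorted vector of usable-session days. A window [d, d + 60] spans
# 61 calendar days inclusive, matching "continue for up to 61 days."
best_window <- function(days, target = 12, window = 60) {
  counts <- vapply(days, function(d) sum(days >= d & days <= d + window),
                   integer(1))
  if (max(counts) >= target) {
    # Earliest window containing at least `target` usable sessions
    start <- days[min(which(counts >= target))]
  } else {
    # Otherwise, earliest window containing the most usable sessions
    start <- days[min(which(counts == max(counts)))]
  }
  days[days >= start & days <= start + window]
}
```

On the worked example in the text (days 1, 30, 35, …, 80), this selects the window starting at day 30, which contains 11 of the 12 sessions.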
Require >= 2s looking to use a trial. Don’t otherwise deal with shorter/longer LTs except in (exploratory) model.
Omit periods where the child is out of frame or gaze is otherwise impossible to code. Treat as out of frame any periods where the video is ‘frozen’ for >1s (start treating as out of frame 1s after this period begins) and report prevalence.
Exclude trials where child is fussing >50% of the trial. In contrast to omitting periods where child is fussing, this avoids dependence on exactly which frames are considered fussy, at the cost of a threshold effect we expect not to affect many trials. >50% applies to length of video, not otherwise codable data (i.e. we allow fuss coding during outofframe periods).
Parent interference: Exclude periods of trials after parent peeks, points while peeking, or speaks in any way that could bias child (“what’s that ball doing?” but not “keep looking sweetie!”) Include periods where the parent’s eyes are not visible and we can’t tell where they’re looking (unless there is reason to believe they are looking) and periods where the parent is looking away but may see stimuli peripherally.
This is a pilot dataset: it consists of (cleaned, post-data-exclusion, post trial-proportion calculations) a set of subjects who participated in one physics session similar to the sessions being run in the main data collection. The principal difference is that the pilot dataset has exactly 1 session per participant, and the conditions X and Y are absent.
QUESTION FOR KIM: Is it the case that all exclusion criteria have been applied in the pilot dataset? (In particular: criteria for calibration?)
## X shortId child.gender child.additionalInformation
## 1 1 57e4675fc0d9d70060c680ee male
## 2 2 57e4a294c0d9d70061c68140 female
## 3 3 57e56dbdc0d9d70060c681f9 female
## 4 4 57e57274c0d9d70061c681ff male
## 5 5 57e596f6c0d9d70060c68274 female
## 6 6 57e939a7c0d9d70060c68458 female
## ageRegistration ageExitsurvey withdrawn consent
## 1 12.756164 12.756164 FALSE yes
## 2 4.832877 4.832877 FALSE yes
## 3 8.679452 8.679452 FALSE yes
## 4 10.915068 10.915068 FALSE yes
## 5 11.408219 11.408219 FALSE yes
## 6 11.769863 11.769863 FALSE yes
## consentnotes usable allcoders coderComments.Kim
## 1 child present, camera far from screen yes ['Alice'] NA
## 2 no child present yes ['Alice'] NA
## 3 child present yes ['Alice'] NA
## 4 child present yes ['Alice'] NA
## 5 child present yes ['Alice'] NA
## 6 child present yes ['Alice'] NA
## coderComments.Jessica coderComments.Alice
## 1 difficult angle
## 2 a lot of squirming, blurry, suddenly dark once
## 3
## 4 a lot of bouncing
## 5 parent keeps peeking
## 6 blurry, far from camera
## nVideosExpected nVideosFound expectedDuration actualDuration
## 1 25 25 477.778 477.7667
## 2 25 23 432.699 432.7000
## 3 25 23 396.937 396.9333
## 4 25 22 404.554 404.5667
## 5 25 25 462.382 462.3667
## 6 25 22 394.653 394.6333
## exit.survey.withdrawal exit.survey.useOfMedia exit.survey.databraryShare
## 1 FALSE private yes
## 2 FALSE private yes
## 3 FALSE scientific yes
## 4 FALSE scientific yes
## 5 FALSE private yes
## 6 FALSE scientific yes
## exit.survey.feedback mood.survey.active mood.survey.childHappy
## 1 7 6
## 2 7 7
## 3 7 7
## 4 6 4
## 5 7 7
## 6 7 7
## mood.survey.rested mood.survey.healthy mood.survey.doingBefore
## 1 6 7 playing outside
## 2 7 7 Playing in her jungle gym
## 3 6 7 in her jumper
## 4 3 5 Nursing/Playing
## 5 7 7 taking a nap
## 6 2 7 playing
## mood.survey.lastEat mood.survey.napWakeUp mood.survey.nextNap
## 1 60 210 60
## 2 60 30 85
## 3 15 120 NA
## 4 5 126 NA
## 5 15 60 180
## 6 120 150 NA
## mood.survey.usualNapSchedule mood.survey.ontopofstuff
## 1 yes 4
## 2 yes 4
## 3 no 7
## 4 no 7
## 5 yes 7
## 6 no 7
## mood.survey.parentHappy mood.survey.energetic child.ageAtBirth
## 1 6 3 40 or more weeks
## 2 4 4 37 weeks
## 3 7 2 37 weeks
## 4 7 5 38 weeks
## 5 6 7 39 weeks
## 6 7 2 40 or more weeks
## codedby_Alice codedby_Jessica use_coder has_preview videonum calibration
## 1 TRUE FALSE Alice FALSE 1 0.9320944
## 2 TRUE FALSE Alice FALSE 1 NA
## 3 TRUE FALSE Alice FALSE 1 0.8694853
## 4 TRUE FALSE Alice FALSE 1 0.9836244
## 5 TRUE FALSE Alice FALSE 1 NA
## 6 TRUE FALSE Alice FALSE 1 1.0000000
## duration left ooftime right stimuli trialnum
## 1 17000 7053 0 7208 calibration 1
## 2 17000 1721 0 668 sbs_ramp_up_down_apple_c1_b1_RR 1
## 3 17000 6532 0 10468 calibration 1
## 4 17000 6901 0 9466 calibration 1
## 5 17000 13600 0 1367 sbs_same_B_A_funnel_c1_b1_NN 1
## 6 17000 6733 0 10267 calibration 1
## event concept object unexpectedLeft unexpectedRight
## 1 calibration <NA> <NA> NA NA
## 2 ramp gravity apple TRUE FALSE
## 3 calibration <NA> <NA> NA NA
## 4 calibration <NA> <NA> NA NA
## 5 same control funnel FALSE FALSE
## 6 calibration <NA> <NA> NA NA
## lessExpectedLeft outcomeLeft outcomeRight comparison fracLeft
## 1 NA <NA> <NA> <NA> 0.4945656
## 2 TRUE up down down:up 0.7203851
## 3 NA <NA> <NA> <NA> 0.3842353
## 4 NA <NA> <NA> <NA> 0.4216411
## 5 NA B A A:B 0.9086657
## 6 NA <NA> <NA> <NA> 0.3960588
## fracLessExpected fracAbs metric totalLT child.proquint
## 1 NA 0.5054344 0.9320944 14.261 vuton-jikut
## 2 0.7203851 0.7203851 0.7203851 2.389 zobuh-kitob
## 3 NA 0.6157647 0.8694853 17.000 lovol-juhan
## 4 NA 0.5783589 0.9836244 16.367 duliz-zuriz
## 5 NA 0.9086657 0.9086657 14.967 fadot-kopuh
## 6 NA 0.6039412 1.0000000 17.000 buzus-sijif
As a warmup, we’ll begin by visualizing the pilot dataset for some basic properties. First, we expect that babies’ total looking time will drop over time (i.e. later trials have shorter looking times).
Next, we’ll begin plotting the critical DV: how much time do babies spend looking at surprising things? Note a complication of the dataset here: we will be modeling the ‘raw’ dependent variable, fracLeft (that is, out of some amount of time staring at the screen, what percentage of that time is spent looking at the left-hand video?). Then, in general, we’ll be asking whether this proportion is affected by lessExpectedLeft: is whichever video is more surprising (from an adult point of view) located on the left or the right?
Remember that lessExpectedLeft corresponds to a different video/video pair type for every event and concept. Critically, this value is undefined for some event types, in particular the SAME events.
First, a histogram of distributions:
This resembles a symmetric beta distribution with a ‘bump’ in the middle (or a mixture of multiple parameter settings?). This is actually sensible, because it includes counterbalancing of all ‘surprisal’ events, and because the above contains cases where babies may usually look about equally at both sides (e.g. the SAME condition), and cases where some (but not all!) babies will consistently look more at one video or the other. To test this, let’s try faceting out by event.
Plot again without calibration event so we can see more clearly:
Let’s assume that we’re dealing with beta distributions. Time to build up the model! We’ll use the glmmTMB package.
QUESTION FOR PATRICK: Any reason/limitation of this package to be aware of?
First, set up the family of distributions we’ll be using:
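A minimal sketch of the intercept-only fit. We assume here that fracLeft_tr is fracLeft shrunk off the [0, 1] boundary (e.g. via the standard (y·(n−1)+0.5)/n transformation), since the beta family requires values strictly inside the unit interval.

```r
library(glmmTMB)

# Intercept-only beta regression on the transformed left-looking proportion.
# (fracLeft_tr is assumed to be fracLeft shrunk away from 0 and 1.)
null_model <- glmmTMB(fracLeft_tr ~ 1,
                      data   = pilotdata,
                      family = beta_family(link = "logit"))
summary(null_model)
```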
## Family: beta ( logit )
## Formula: fracLeft_tr ~ 1
## Data: pilotdata
##
## AIC BIC logLik deviance df.resid
## -92.5 -82.9 48.3 -96.5 896
##
##
## Overdispersion parameter for beta family (): 1.41
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.04614 0.04002 -1.153 0.249
The primary predictor we are interested in is whether children look longer (or at least a non-chance fraction!) at unexpected events (lessExpectedLeft). To complicate things, this value is undefined for two event types (same, calibration). We’ll leave those out for now (they are used only for the ‘looking personality’ results.)
For the questions about infants’ knowledge of physics, we’ll limit ourselves to this question about unexpectedness! To model this, we add this predictor (and its interactions with age and trial number, plus random intercepts for child and stimulus) to the model:
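A sketch of this ‘surprisingness’ model, assuming physics_events is the pilot dataset with the same/calibration trials already dropped:

```r
library(glmmTMB)

# Does the left-looking proportion depend on whether the less expected event
# is on the left, crossed with age and trial number, with random intercepts
# per child (shortId) and per stimulus video pair (stimuli)?
task_model <- glmmTMB(
  fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
    (1 | shortId) + (1 | stimuli),
  data   = physics_events,
  family = beta_family(link = "logit")
)
summary(task_model)
```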
## Family: beta ( logit )
## Formula:
## fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
## (1 | shortId) + (1 | stimuli)
## Data: physics_events
##
## AIC BIC logLik deviance df.resid
## -237.4 -189.1 129.7 -259.4 584
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## shortId (Intercept) 0.5255 0.7249
## stimuli (Intercept) 0.2879 0.5365
## Number of obs: 595, groups: shortId, 43; stimuli, 363
##
## Overdispersion parameter for beta family (): 2.3
##
## Conditional model:
## Estimate Std. Error z value
## (Intercept) 1.269287 0.543342 2.336
## lessExpectedLeftTRUE -1.914727 0.609890 -3.139
## ageRegistration -0.115478 0.058335 -1.980
## trialnum -0.065141 0.029651 -2.197
## lessExpectedLeftTRUE:ageRegistration 0.164995 0.066091 2.497
## lessExpectedLeftTRUE:trialnum 0.123222 0.043591 2.827
## ageRegistration:trialnum 0.005095 0.003257 1.564
## lessExpectedLeftTRUE:ageRegistration:trialnum -0.010928 0.004829 -2.263
## Pr(>|z|)
## (Intercept) 0.01949 *
## lessExpectedLeftTRUE 0.00169 **
## ageRegistration 0.04775 *
## trialnum 0.02802 *
## lessExpectedLeftTRUE:ageRegistration 0.01254 *
## lessExpectedLeftTRUE:trialnum 0.00470 **
## ageRegistration:trialnum 0.11769
## lessExpectedLeftTRUE:ageRegistration:trialnum 0.02364 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
NOTE: In the pilot dataset, this actually makes the overall AIC/BIC get worse; this is relatively unsurprising since many of these comparisons yield a finding (when graphed) that children do not seem to be affected by surprisingness! It would also not be terribly shocking to find in the real dataset, since lumping all the event types and ages together is not appropriate.
QUESTION FOR PATRICK: I am used to comparing models with anova(model1, model2), but wonder if there is a better choice. I would like to be able to express confidence that including the term is warranted (e.g. ‘surprisingness matters’). This could be p-values or something else, but would strongly prefer to be using only one (family of) hypothesis-testing metric.
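For concreteness, the comparison as currently implemented is a likelihood-ratio test on nested glmmTMB fits via anova(); a sketch (variable names follow the pilot data, and the reduced/full split shown is one possible nesting):

```r
library(glmmTMB)

# Nested-model comparison: does adding lessExpectedLeft (and its interactions)
# improve fit? anova() on nested glmmTMB fits performs a chi-square
# likelihood-ratio test and also reports AIC/BIC.
reduced <- glmmTMB(fracLeft_tr ~ ageRegistration * trialnum +
                     (1 | shortId) + (1 | stimuli),
                   data = physics_events, family = beta_family())
full    <- glmmTMB(fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
                     (1 | shortId) + (1 | stimuli),
                   data = physics_events, family = beta_family())
anova(reduced, full)
```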
To get actually interpretable models that we can test, first take each event type (Ramp, Fall, Stay, etc.) individually, and (a) ask whether the interaction between surprisingness and age is significant overall for that subset (i.e. repeat the ‘task model’ for each event type, as though they were independent experiments), (b) ask whether there is a significant difference from chance at 12mo, and (c) Display age trends.
QUESTION FOR KIM: You wrote “Could also do mean fLT ~ age + comparisonType + (1|child) or similar, except for concerns about differing correlations among subsets of comparison types.” This is what I did (since both are identical for the single-session version), but can you say more about this?
So, we’ll first set up the common plan, and then execute each!
(This is the template model for all other individual events; just including Salience here for space; analyses will be parallel.)
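The common plan amounts to refitting the task model within each event-type subset; a sketch, assuming a dataset like the pilot data with the lessExpectedLeft-undefined trials (same, calibration) already removed:

```r
library(glmmTMB)

# Fit the 'task model' separately for each event type, as though each
# subset were an independent experiment.
by_event <- split(physics_events, physics_events$event)
event_models <- lapply(by_event, function(d)
  glmmTMB(fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
            (1 | shortId) + (1 | stimuli),
          data = d, family = beta_family()))

# Inspect any one event type, e.g. the ramp events:
summary(event_models[["ramp"]])
```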
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## (this warning was emitted 37 times in total)
## Warning in nlminb(start = par, objective = fn, gradient = gr, control =
## control$optCtrl): NA/NaN function evaluation
## Family: beta ( logit )
## Formula:
## fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
## (1 | shortId) + (1 | stimuli)
## Data: salience
##
## AIC BIC logLik deviance df.resid
## -49.2 -19.3 35.6 -71.2 101
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## shortId (Intercept) 0.4320 0.6573
## stimuli (Intercept) 0.2257 0.4751
## Number of obs: 112, groups: shortId, 42; stimuli, 42
##
## Overdispersion parameter for beta family (): 2.13
##
## Conditional model:
## Estimate Std. Error z value
## (Intercept) 0.151404 1.167804 0.130
## lessExpectedLeftTRUE 0.588075 1.678450 0.350
## ageRegistration -0.089734 0.132795 -0.676
## trialnum 0.018459 0.076027 0.243
## lessExpectedLeftTRUE:ageRegistration 0.059473 0.181586 0.328
## lessExpectedLeftTRUE:trialnum 0.007601 0.119475 0.064
## ageRegistration:trialnum -0.001128 0.008590 -0.131
## lessExpectedLeftTRUE:ageRegistration:trialnum -0.001236 0.013044 -0.095
## Pr(>|z|)
## (Intercept) 0.897
## lessExpectedLeftTRUE 0.726
## ageRegistration 0.499
## trialnum 0.808
## lessExpectedLeftTRUE:ageRegistration 0.743
## lessExpectedLeftTRUE:trialnum 0.949
## ageRegistration:trialnum 0.896
## lessExpectedLeftTRUE:ageRegistration:trialnum 0.924
After predicting whether children become more sensitive to (adult-defined) surprising versions > less surprising versions of each event type, we will ask some further questions that are specific to each domain, and also attempt to discover whether the theoretical clusterings (of comparisons into events, and events into concepts) actually reflect individual children’s performance.
For all of these, we’ll start with ‘surprisingness’ model above, and ask whether various additions of the ‘comparison’ factor improve fit.
Gravity, graded judgment: predict table up v down > down v continue > up v continue. Is each pairwise comparison significant?
Method: Filter to each pairwise comparison, ask whether including the comparison type (e.g. up/down vs up/continue) improves model fit.
TABLE event doesn’t exist in the pilot set, so try this with two examples from the ‘fall’ event instead:
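A sketch of that pairwise test on the pilot ‘fall’ event. The comparison level names used in the subset are illustrative (they should match the levels actually present in the dataset); the test itself is whether the fit with the comparison factor beats the base task model.

```r
library(glmmTMB)

# Filter to one pair of comparison types within the 'fall' event.
# (Level names here are hypothetical stand-ins.)
fall_mo_vs_near <- subset(pilotdata,
                          event == "fall" &
                          comparison %in% c("mostly-on:near",
                                            "mostly-on:slightly-on"))

base_model <- glmmTMB(fracLeft_tr ~ lessExpectedLeft * ageRegistration *
                        trialnum + (1 | shortId) + (1 | stimuli),
                      data = fall_mo_vs_near, family = beta_family())

comp_model <- glmmTMB(fracLeft_tr ~ comparison * lessExpectedLeft *
                        ageRegistration * trialnum +
                        (1 | shortId) + (1 | stimuli),
                      data = fall_mo_vs_near, family = beta_family())

# Does knowing the comparison type carry extra information?
anova(base_model, comp_model)
```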
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-
## definite Hessian matrix. See vignette('troubleshooting')
## Warning in sqrt(diag(vcov)): NaNs produced
## Family: beta ( logit )
## Formula:
## fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
## (1 | shortId) + (1 | stimuli)
## Data: fall_mo_vs_near
##
## AIC BIC logLik deviance df.resid
## NA NA NA NA 2
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## shortId (Intercept) 7.677e+00 2.771e+00
## stimuli (Intercept) 6.989e-11 8.360e-06
## Number of obs: 13, groups: shortId, 7; stimuli, 11
##
## Overdispersion parameter for beta family (): 1.08e+03
##
## Conditional model:
## Estimate Std. Error z value
## (Intercept) 9.307063 3.451196 2.697
## lessExpectedLeftTRUE -0.109486 NA NA
## ageRegistration -0.957806 0.351433 -2.725
## trialnum -0.695669 0.090908 -7.652
## lessExpectedLeftTRUE:ageRegistration -1.255866 NA NA
## lessExpectedLeftTRUE:trialnum 0.622998 NA NA
## ageRegistration:trialnum 0.060190 0.007336 8.204
## lessExpectedLeftTRUE:ageRegistration:trialnum 0.018815 NA NA
## Pr(>|z|)
## (Intercept) 0.00700 **
## lessExpectedLeftTRUE NA
## ageRegistration 0.00642 **
## trialnum 1.97e-14 ***
## lessExpectedLeftTRUE:ageRegistration NA
## lessExpectedLeftTRUE:trialnum NA
## ageRegistration:trialnum 2.32e-16 ***
## lessExpectedLeftTRUE:ageRegistration:trialnum NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Warning in nlminb(start = par, objective = fn, gradient = gr, control =
## control$optCtrl): NA/NaN function evaluation
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-
## definite Hessian matrix. See vignette('troubleshooting')
## Warning in fitTMB(TMBStruc): Model convergence problem; false convergence
## (8). See vignette('troubleshooting')
## Family: beta ( logit )
## Formula:
## fracLeft_tr ~ comparison * lessExpectedLeft * ageRegistration *
## trialnum + (1 | shortId) + (1 | stimuli)
## Data: fall_mo_vs_near
##
## AIC BIC logLik deviance df.resid
## NA NA NA NA -6
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## shortId (Intercept) 1.490e+01 3.859e+00
## stimuli (Intercept) 2.301e-13 4.797e-07
## Number of obs: 13, groups: shortId, 7; stimuli, 11
##
## Overdispersion parameter for beta family (): 2.12e+11
##
## Conditional model:
## Estimate
## (Intercept) 34.41331
## comparisonmostly-on:slightly-on -0.06905
## lessExpectedLeftTRUE -0.06412
## ageRegistration -4.45788
## trialnum -2.17283
## comparisonmostly-on:slightly-on:lessExpectedLeftTRUE -0.06412
## comparisonmostly-on:slightly-on:ageRegistration 1.14075
## lessExpectedLeftTRUE:ageRegistration -1.12009
## comparisonmostly-on:slightly-on:trialnum -0.99083
## lessExpectedLeftTRUE:trialnum 1.00068
## ageRegistration:trialnum 0.26850
## comparisonmostly-on:slightly-on:lessExpectedLeftTRUE:ageRegistration -1.12009
## comparisonmostly-on:slightly-on:lessExpectedLeftTRUE:trialnum 1.00068
## comparisonmostly-on:slightly-on:ageRegistration:trialnum 0.03565
## lessExpectedLeftTRUE:ageRegistration:trialnum -0.03737
## comparisonmostly-on:slightly-on:lessExpectedLeftTRUE:ageRegistration:trialnum -0.03737
## (Std. Error, z value, and Pr(>|z|) are NA for every term above: the model
## did not converge, so no standard errors are available.)
QUESTION FOR PATRICK: These models don’t converge, and I’m not sure of the correct procedure for proceeding in this case!
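If we do hit non-convergence in the final dataset, one defensible fallback ladder (a sketch, not yet a committed decision rule) is to drop the random effect whose variance has collapsed to ~0, prune the highest-order interactions, and only then try an alternative optimizer:

```r
library(glmmTMB)

# Steps 1 & 2: drop the near-zero stimuli intercept and the 3-way
# interaction; step 3: switch optimizer if the fit still fails
fit_fallback <- glmmTMB(
  fracLeft_tr ~ (lessExpectedLeft + ageRegistration + trialnum)^2 +
    (1 | shortId),
  data = fall_mo_vs_near, family = beta_family(link = "logit"),
  control = glmmTMBControl(optimizer = optim, optArgs = list(method = "BFGS")))
```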
(No within-event tests)
(No within-event tests)
QUESTION FOR PATRICK (this whole section)
Comparisons are theoretically paired across the stay/fall distinction. Specifically, we expect each comparison to anticorrelate across the stay/fall distinction. Additionally, we expect that different comparisons should ‘come online’ at different points.
The goal here is to define a model that can support asking these questions. My best guess at this model would be:
support <- bind_rows(stay, fall)
#Original 'surprisingness' model
fitbeta_support <- glmmTMB(fracLeft_tr ~ lessExpectedLeft*ageRegistration*trialnum + (1|shortId) + (1|stimuli), data = support, family = beta_family(link = "logit"))
## Warning in nlminb(start = par, objective = fn, gradient = gr, control =
## control$optCtrl): NA/NaN function evaluation
summary(fitbeta_support)
## Family: beta ( logit )
## Formula:
## fracLeft_tr ~ lessExpectedLeft * ageRegistration * trialnum +
## (1 | shortId) + (1 | stimuli)
## Data: support
##
## AIC BIC logLik deviance df.resid
## -78.8 -39.5 50.4 -100.8 251
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## shortId (Intercept) 0.6224 0.7889
## stimuli (Intercept) 0.1827 0.4274
## Number of obs: 262, groups: shortId, 43; stimuli, 193
##
## Overdispersion parameter for beta family (): 2.6
##
## Conditional model:
## Estimate Std. Error
## (Intercept) 0.6642583 0.7236270
## lessExpectedLeftTRUE -1.1723782 0.9095106
## ageRegistration -0.0363447 0.0762599
## trialnum -0.0143734 0.0413827
## lessExpectedLeftTRUE:ageRegistration 0.0502460 0.0960252
## lessExpectedLeftTRUE:trialnum 0.0412715 0.0672363
## ageRegistration:trialnum -0.0014358 0.0045467
## lessExpectedLeftTRUE:ageRegistration:trialnum -0.0006354 0.0072634
## z value Pr(>|z|)
## (Intercept) 0.918 0.359
## lessExpectedLeftTRUE -1.289 0.197
## ageRegistration -0.477 0.634
## trialnum -0.347 0.728
## lessExpectedLeftTRUE:ageRegistration 0.523 0.601
## lessExpectedLeftTRUE:trialnum 0.614 0.539
## ageRegistration:trialnum -0.316 0.752
## lessExpectedLeftTRUE:ageRegistration:trialnum -0.088 0.930
# Add event/comparison terms
fitbeta_support_comparisons <- glmmTMB(fracLeft_tr ~ ageRegistration*lessExpectedLeft*event*comparison + ageRegistration*trialnum + (1|shortId) + (1|stimuli), data = support, family = beta_family(link = "logit"))
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in f(par, order = order, ...): value out of range in 'lgamma'
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-
## definite Hessian matrix. See vignette('troubleshooting')
summary(fitbeta_support_comparisons)
## Family: beta ( logit )
## Formula:
## fracLeft_tr ~ ageRegistration * lessExpectedLeft * event * comparison +
## ageRegistration * trialnum + (1 | shortId) + (1 | stimuli)
## Data: support
##
## AIC BIC logLik deviance df.resid
## NA NA NA NA 209
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## shortId (Intercept) 0.62972 0.7935
## stimuli (Intercept) 0.01833 0.1354
## Number of obs: 262, groups: shortId, 43; stimuli, 193
##
## Overdispersion parameter for beta family (): 2.8
##
## Conditional model:
## Estimate
## (Intercept) -1.451e+00
## ageRegistration 8.992e-02
## lessExpectedLeftTRUE 2.294e+01
## eventstay 3.123e+00
## comparisonmostly-on:next-to 2.295e+00
## comparisonmostly-on:slightly-on 9.078e-02
## comparisonnear:next-to 1.851e+00
## comparisonnear:slightly-on -1.434e+02
## comparisonnext-to:slightly-on 1.135e+01
## trialnum 1.674e-02
## ageRegistration:lessExpectedLeftTRUE -1.843e+00
## ageRegistration:eventstay -1.841e-01
## lessExpectedLeftTRUE:eventstay -2.502e+01
## ageRegistration:comparisonmostly-on:next-to -2.348e-01
## ageRegistration:comparisonmostly-on:slightly-on -2.904e-02
## ageRegistration:comparisonnear:next-to -1.934e-01
## ageRegistration:comparisonnear:slightly-on 1.151e+01
## ageRegistration:comparisonnext-to:slightly-on -9.557e-01
## lessExpectedLeftTRUE:comparisonmostly-on:next-to -2.215e+01
## lessExpectedLeftTRUE:comparisonmostly-on:slightly-on -2.022e+01
## lessExpectedLeftTRUE:comparisonnear:next-to -2.132e+01
## lessExpectedLeftTRUE:comparisonnear:slightly-on 1.247e+02
## lessExpectedLeftTRUE:comparisonnext-to:slightly-on -3.429e+01
## eventstay:comparisonmostly-on:next-to -4.119e+00
## eventstay:comparisonmostly-on:slightly-on -4.276e+00
## eventstay:comparisonnear:next-to -2.802e+00
## eventstay:comparisonnear:slightly-on 1.428e+02
## eventstay:comparisonnext-to:slightly-on -1.169e+01
## ageRegistration:trialnum -3.518e-03
## ageRegistration:lessExpectedLeftTRUE:eventstay 1.958e+00
## ageRegistration:lessExpectedLeftTRUE:comparisonmostly-on:next-to 1.855e+00
## ageRegistration:lessExpectedLeftTRUE:comparisonmostly-on:slightly-on 1.573e+00
## ageRegistration:lessExpectedLeftTRUE:comparisonnear:next-to 1.751e+00
## ageRegistration:lessExpectedLeftTRUE:comparisonnear:slightly-on -1.003e+01
## ageRegistration:lessExpectedLeftTRUE:comparisonnext-to:slightly-on 2.866e+00
## ageRegistration:eventstay:comparisonmostly-on:next-to 3.745e-01
## ageRegistration:eventstay:comparisonmostly-on:slightly-on 3.696e-01
## ageRegistration:eventstay:comparisonnear:next-to 2.599e-01
## ageRegistration:eventstay:comparisonnear:slightly-on -1.157e+01
## ageRegistration:eventstay:comparisonnext-to:slightly-on 9.426e-01
## lessExpectedLeftTRUE:eventstay:comparisonmostly-on:next-to 2.296e+01
## lessExpectedLeftTRUE:eventstay:comparisonmostly-on:slightly-on 2.560e+01
## lessExpectedLeftTRUE:eventstay:comparisonnear:next-to 2.269e+01
## lessExpectedLeftTRUE:eventstay:comparisonnear:slightly-on -1.268e+02
## lessExpectedLeftTRUE:eventstay:comparisonnext-to:slightly-on 3.433e+01
## ageRegistration:lessExpectedLeftTRUE:eventstay:comparisonmostly-on:next-to -1.854e+00
## ageRegistration:lessExpectedLeftTRUE:eventstay:comparisonmostly-on:slightly-on -2.038e+00
## ageRegistration:lessExpectedLeftTRUE:eventstay:comparisonnear:next-to -1.875e+00
## ageRegistration:lessExpectedLeftTRUE:eventstay:comparisonnear:slightly-on 1.037e+01
## ageRegistration:lessExpectedLeftTRUE:eventstay:comparisonnext-to:slightly-on -2.791e+00
## (Std. Error, z value, and Pr(>|z|) are NA for every term above: the model
## did not converge, so no standard errors are available.)
#This doesn't converge!
Once specified, we’d like to ask the following questions:
Is it the case that predictions for STAY events ‘anticorrelate’ with FALL events?
Intuitive approximate magnitude ordering based on adult understanding of physics: mostly-near, mostly-next, mostly-slightly, slightly-near, slightly-next, next-near. What orderings do we actually see within kids?
QUESTION FOR PATRICK: I’d like to be able to specify a planned comparison of these orderings, but I’m not sure how!
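One possible route (an assumption on my part, pending Patrick’s advice) is to treat the six comparison types as an ordered factor and test the predicted ordering with planned contrasts via emmeans, which supports glmmTMB fits:

```r
library(emmeans)

# Estimated marginal means per comparison type, from a model with
# 'comparison' as a six-level fixed effect
emm <- emmeans(fitbeta_support_comparisons, ~ comparison)

# Polynomial contrasts test for a monotone (linear) trend across the
# hypothesized magnitude ordering (factor levels must be ordered accordingly)
contrast(emm, "poly")

# Or hand-specified planned contrasts; these weights are illustrative only
contrast(emm, method = list(
  extreme_vs_subtle = c(1, 1, 1, -1, -1, -1) / 3))
```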
HYPOTHESIS SUGGESTIONS FROM KIM: Stage theory: we might make crude predictions based on which configurations should stay vs. fall. If only mostly-on should stay put, then mostly-slightly, mostly-next, and mostly-near are bigger differences and might show bigger preferences than the others, since in these cases we have one “expected” and one “unexpected” outcome. If mostly-on and slightly-on both stay put, then we instead expect mostly-next, mostly-near, slightly-next, and slightly-near to show bigger preferences than the others. Etc.

Project preference vectors onto [1,1,1,0,0,0], [0,1,1,0,1,1], and [0,0,0,1,1,1] and plot the transformed coordinates by age, where the transformed coordinates roughly represent “how much like a partial-support-knower you act,” “how much like an any-support-from-below-knower you act,” and “how much like an any-contact-is-support-knower you act.”

Overall, across kids, what preference vectors do we see, binned by age group? (Plot mean vectors over time?) We could imagine kids getting closer to [1,1,1,0,0,0] as above, or getting better on everything, even the comparisons that aren’t disparate in possibility. That is, we might expect a baby who “really gets it” not to care about something staying in midair vs. staying when placed next to a cabinet, because both are obviously impossible; or we might expect such a baby to differentiate even this subtle difference in probability (because hey, maybe it’s a sticky table, or my perception is noisy). Nothing in Baillargeon’s stage theory predicts getting better at this comparison with age, so it’s a nice test.
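Kim’s projection idea could be sketched like this (the entry ordering of the six comparisons is an assumption here, and must be matched to the dataset’s actual coding before use):

```r
# Templates for the three candidate 'knower' profiles over the six comparisons;
# ASSUMED entry order: the three mostly-on comparisons first, then the rest
templates <- rbind(
  partial_support    = c(1, 1, 1, 0, 0, 0),
  support_from_below = c(0, 1, 1, 0, 1, 1),
  any_contact        = c(0, 0, 0, 1, 1, 1))

# prefs: one row per child, six preference scores (hypothetical matrix)
project_knowers <- function(prefs, templates) {
  # normalize each template so the three coordinates are comparable
  t_unit <- templates / sqrt(rowSums(templates^2))
  prefs %*% t(t_unit)  # children x 3 matrix of 'knower-likeness' coordinates
}
```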
Separate from the question about physics, we can use the ‘control’ events to ask about the dynamics of looking time in these experiments. For these, we’ll return to the main dataset and split off the control events:
controldata <- pilotdata %>%
filter(event == "same"|event == "calibration"|event=="salience")
#Keep all the subject-level variables we'll be caring about!
subjlevel <- controldata %>%
select(-c("videonum","calibration", "duration","left","ooftime", "right", "stimuli","trialnum","event","concept","object","unexpectedLeft", "unexpectedRight","lessExpectedLeft","outcomeLeft", "outcomeRight", "comparison", "fracLeft", "fracLessExpected", "fracAbs", "metric", "totalLT", "child.proquint", "fracLeft_tr")) %>%
group_by(shortId)%>%
slice(1)%>%
ungroup() %>%
select(-X)
Then, we’ll calculate a series of per-child/per-session measures:
Side bias: fractional looking time to R during ‘same’ events (3 events/session)
sidebias <- controldata %>%
filter(event == "same")%>%
group_by(shortId)%>%
dplyr::summarize(sidebias = 1-mean(fracLeft_tr))
sidebias = merge(sidebias, subjlevel, by=c("shortId"))
Stickiness: |fLT – 0.5| during ‘same’ events (3 events/session)
stickiness <- controldata %>%
  filter(event == "same")%>%
  group_by(shortId)%>%
  # per-event |fLT - 0.5|, then averaged; note abs(mean(...) - 0.5) would
  # instead duplicate the side-bias-strength measure
  dplyr::summarize(stickiness = mean(abs(fracLeft_tr - 0.5)))
stickiness = merge(stickiness, subjlevel, by=c("shortId"))
Sensitivity: fLT to more interesting during ‘salience’ events (3 events/session)
# Not yet implemented: the intended measure is fLT to the more interesting
# side (plausibly via fracLessExpected rather than raw fracLeft_tr)
# sensitivity <- controldata %>%
#   filter(event == "salience") %>%
#   group_by(shortId) %>%
#   dplyr::summarize(sensitivity = mean(fracLessExpected))
# sensitivity = merge(sensitivity, subjlevel, by = "shortId")
Total looking time (all events)
How stable are control measures across sessions, and do they change with age? Partition variance for each measure: measure ~ age + (1|child/session). (The fraction of total variance attributable to child is the intraclass correlation coefficient.) Report the coefficient of age and test for significance. Display overall distributions of these measures (one mean per child) and plot measures against sessions (e.g., one line per child).
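The variance partition described above can be sketched with lme4 (the per-session data frame `sidebias_by_session` and the `sessionnum` column are placeholders for the final variable names):

```r
library(lme4)

# measure ~ age + (1 | child/session), here for the side-bias measure
fit_icc <- lmer(sidebias ~ ageRegistration + (1 | shortId / sessionnum),
                data = sidebias_by_session)

# ICC: child-level variance as a fraction of total variance
vc <- as.data.frame(VarCorr(fit_icc))
icc_child <- vc$vcov[vc$grp == "shortId"] / sum(vc$vcov)
```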
Are controls well predicted by mood measures? Regress each control measure (using means per session, and |(mean fLT to right) - 0.5| for side bias as a measure of strength of bias to either side) using model: measure ~ 1 + parentscore + childscore + childactivity + timesincewaking + timeuntilsleep + trial# + (1|child). Report overall significance of the model (based on F value) and coefficients of individual predictors. Predictors:
Parent score: mean of z-scored parent items
Child score: mean of z-scored child items rested, healthy, happy
Child activity: calm-active score
Time since waking up
Time until due for sleep (“overdue” = 0, “no schedule” = missing data)
How well does side bias during control stimuli predict side bias during tests? sb_test ~ sb_control + (1|child). Use mean looking time to right across all same/test trials in a session (sb_control includes up to 3 ‘same’ trials, sb_test includes all other trials except calibration). Report coefficient & test for significance of sb_control.
How well does sensitivity control predict fLT on events? fLT ~ sensitivity + (1|child). One mean fLT value per session, including all tests with unambiguous expected outcome to adults (excluding calibration, same, table up-continue, stay/fall without mostly-on as one of the outcomes). Report coefficient & test for significance of sensitivity.
How well does stickiness predict noisiness of data? variance_salience ~ stickiness + (1|child). Report coefficient & test for significance of stickiness.
How stable are kids’ looking patterns on test stimuli across sessions? How much do the specific videos (e.g. object choices, backgrounds) matter in comparison to the event types? fLT_trial ~ concept/event/comparison + concept:event:comparison:object + concept:event:comparison:object:cameraangle + concept:event:comparison:object:background + concept:event:comparison:object:flip + trial# + (1|child/comparison) + (1|child:session). Report the fraction of variance explained by child/comparison and by session, the coefficient for trial# with significance, and use ANOVA to report overall effects of object, camera angle, background, and flip across children.

Do preferences change over the course of the 15 sessions, across children? fLT_trial ~ concept/event/comparison + trial# + session# + (1|child/comparison). Report the coefficient for session# and test for significance.
(Guidance for future studies, not specific hypotheses about physics/looking personalities)
Summary of children’s preferences on these events:
Bin together each comparison type (each physics comparison + salience control), all measurements per child. Separately bin by concept (gravity, support, inertia). Add total time looking at target and distractor across all trials to get an fLT measurement.
Bootstrap confidence intervals per child by resampling sessions, then trials.
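The hierarchical bootstrap can be sketched as follows (the column names `lookTarget`, `lookDistractor`, and `sessionnum` are placeholders for the actual coding):

```r
# Resample sessions with replacement, then trials within each sampled session,
# and recompute the aggregate fLT each time
boot_flt <- function(child_df, n_boot = 1000) {
  replicate(n_boot, {
    sessions <- sample(unique(child_df$sessionnum), replace = TRUE)
    resampled <- do.call(rbind, lapply(sessions, function(s) {
      trials <- child_df[child_df$sessionnum == s, , drop = FALSE]
      trials[sample(nrow(trials), replace = TRUE), , drop = FALSE]
    }))
    sum(resampled$lookTarget) /
      (sum(resampled$lookTarget) + sum(resampled$lookDistractor))
  })
}

# 95% CI per child:
# quantile(boot_flt(one_child_df), c(0.025, 0.975))
```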
What fraction of children have a preference for either the expected or unexpected event with 95% CI not overlapping 50% looking, for each grouping? (Just informative for design of future studies - how much data do you need to see individual preferences? May also want to calculate for smaller subsets of the data, e.g. how many sessions until X% children show individually-significant results when group-level effect size is Y.)